    Universal Communication, Universal Graphs, and Graph Labeling

    We introduce a communication model called universal SMP, in which Alice and Bob receive a function f belonging to a family ℱ, and inputs x and y. Alice and Bob use shared randomness to send a message to a third party who cannot see f, x, y, or the shared randomness, and must decide f(x,y). Our main application of universal SMP is to relate communication complexity to graph labeling, where the goal is to give a short label to each vertex in a graph, so that adjacency or other functions of two vertices x and y can be determined from the labels ℓ(x), ℓ(y). We give a universal SMP protocol using O(k^2) bits of communication for deciding whether two vertices have distance at most k in distributive lattices (generalizing the k-Hamming Distance problem in communication complexity), and explain how this implies an O(k^2 log n) labeling scheme for deciding dist(x,y) ≤ k on distributive lattices with size n; in contrast, we show that a universal SMP protocol for determining dist(x,y) ≤ 2 in modular lattices (a superset of distributive lattices) has super-constant Ω(n^{1/4}) communication cost. On the other hand, we demonstrate that many graph families known to have efficient adjacency labeling schemes, such as trees, low-arboricity graphs, and planar graphs, admit constant-cost communication protocols for adjacency. Trees also have an O(k) protocol for deciding dist(x,y) ≤ k and planar graphs have an O(1) protocol for dist(x,y) ≤ 2, which implies a new O(log n) labeling scheme for the same problem on planar graphs.
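
    As a rough illustration of the constant-cost adjacency protocols mentioned above, the following Python sketch handles rooted trees by reducing adjacency to two Equality tests on hashed vertex names. It is a minimal sketch of a standard technique and not necessarily the protocol from the paper; the names `shared_hash`, `message`, `referee`, and the parameter `bits` are hypothetical, and the seeded pseudo-random generator merely stands in for shared randomness.

```python
import random

def shared_hash(seed, vertex, bits):
    """Hash a vertex name to `bits` pseudo-random bits using the shared seed."""
    return random.Random(f"{seed}:{vertex}").getrandbits(bits)

def message(seed, parent, v, bits=16):
    """Message from the player holding v: hashes of v and of v's parent.

    `parent` maps each non-root vertex of a rooted tree to its parent.
    The root has no parent, so its second component is None.
    """
    p = parent.get(v)
    return (shared_hash(seed, v, bits),
            None if p is None else shared_hash(seed, p, bits))

def referee(msg_x, msg_y):
    """Decide adjacency from the two messages alone (no tree, x, y, or seed)."""
    hx, hpx = msg_x
    hy, hpy = msg_y
    # x and y are adjacent in a rooted tree iff one is the parent of the other.
    return hpx == hy or hpy == hx

# Usage: the path a - b - c - d, rooted at a.
parent = {"b": "a", "c": "b", "d": "c"}
seed = 12345
msgs = {v: message(seed, parent, v, bits=64) for v in "abcd"}
print(referee(msgs["b"], msgs["c"]))
```

    Each message has 2 * `bits` bits regardless of the size of the tree. The error is one-sided: adjacent pairs are always accepted, while a non-adjacent pair is wrongly accepted only if two hashes collide.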

    Randomized Communication and Implicit Representations for Matrices and Graphs of Small Sign-Rank

    We prove a characterization of the structural conditions on matrices of sign-rank 3 and unit disk graphs (UDGs) which permit constant-cost public-coin randomized communication protocols. Therefore, under these conditions, these graphs also admit implicit representations. The sign-rank of a matrix $M \in \{\pm 1\}^{N \times N}$ is the smallest rank of a matrix $R$ such that $M_{i,j} = \mathrm{sign}(R_{i,j})$ for all $i,j \in [N]$; equivalently, it is the smallest dimension $d$ in which $M$ can be represented as a point-halfspace incidence matrix with halfspaces through the origin, and it is essentially equivalent to the unbounded-error communication complexity. Matrices of sign-rank 3 can achieve the maximum possible bounded-error randomized communication complexity $\Theta(\log N)$, and meanwhile the existence of implicit representations for graphs of bounded sign-rank (including UDGs, which have sign-rank 4) has been open since at least 2003. We prove that matrices of sign-rank 3, and UDGs, have constant randomized communication complexity if and only if they do not encode arbitrarily large instances of the Greater-Than communication problem, or, equivalently, if they do not contain arbitrarily large half-graphs as semi-induced subgraphs. This also establishes the existence of implicit representations for these graphs under the same conditions.
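
    To make the definition of sign-rank concrete, the Python sketch below exhibits a real matrix of rank at most 2 whose signs reproduce an N x N Greater-Than matrix. The convention M[i][j] = +1 exactly when i <= j, and the helper names `greater_than_matrix`, `rank_two_witness`, and `sign_pattern`, are assumptions made for illustration and are not taken from the paper.

```python
def greater_than_matrix(n):
    """Sign matrix with M[i][j] = +1 if i <= j and -1 otherwise."""
    return [[1 if i <= j else -1 for j in range(n)] for i in range(n)]

def rank_two_witness(n):
    """Return factors U and V (both n x 2), so that R = U V^T has rank at most 2.

    R[i][j] = (j + 0.5) - i, which is positive exactly when i <= j.
    """
    U = [[1.0, -float(i)] for i in range(n)]
    V = [[j + 0.5, 1.0] for j in range(n)]
    return U, V

def sign_pattern(U, V):
    """Entrywise sign of the product U V^T."""
    def sign(t):
        return 1 if t > 0 else -1
    return [[sign(sum(a * b for a, b in zip(u, v))) for v in V] for u in U]

# Usage: the signs of the rank-2 witness agree with the Greater-Than matrix.
U, V = rank_two_witness(8)
print(sign_pattern(U, V) == greater_than_matrix(8))
```

    Under this convention the Greater-Than matrix has sign-rank at most 2 for every N, so bounded sign-rank alone does not rule out arbitrarily large Greater-Than instances; this is consistent with the characterization above treating their absence as a separate condition.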

    Testing, Learning, Sampling, Sketching

    We study several problems about sublinear algorithms, presented in two parts. Part I: Property testing and learning. There are two main goals of research in property testing and learning theory. The first is to understand the relationship between testing and learning, and the second is to develop efficient testing and learning algorithms. We present results towards both goals. First, an oft-repeated motivation for property testing algorithms is to help with model selection in learning: to efficiently check whether the chosen hypothesis class (i.e. learning model) will successfully learn the target function. We present in this thesis a proof that, for many of the most useful and natural hypothesis classes (including halfspaces, polynomial threshold functions, intersections of halfspaces, etc.), the sample complexity of testing in the distribution-free model is nearly equal to that of learning. This shows that testing does not give a significant advantage in model selection in this setting. Second, we present a simple and general technique for transforming testing and learning algorithms designed for the uniform distribution over {0, 1}^d or [n]^d into algorithms that work for arbitrary product distributions over R^d. This leads to an improvement and simplification of state-of-the-art results for testing monotonicity, learning intersections of halfspaces, learning polynomial threshold functions, and others. Part II: Adjacency and distance sketching for graphs. We initiate the thorough study of adjacency and distance sketching for classes of graphs. Two open problems in sublinear algorithms are: 1) to understand the power of randomization in communication; and 2) to characterize the sketchable distance metrics. We observe that constant-cost randomized communication is equivalent to adjacency sketching in a hereditary graph class, which in turn implies the existence of an efficient adjacency labeling scheme, the subject of a major open problem in structural graph theory. Therefore characterizing the adjacency sketchable graph classes (i.e. the constant-cost communication problems) is the probabilistic equivalent of this open problem, and an essential step towards understanding the power of randomization in communication. This thesis gives the first results towards a combined theory of these problems and uses this connection to obtain optimal adjacency labels for subgraphs of Cartesian products, resolving some questions from the literature. More generally, we begin to develop a theory of graph sketching for problems that generalize adjacency, including different notions of distance sketching. This connects the well-studied areas of distance sketching in sublinear algorithms, and distance labeling in structural graph theory.

    Halfway to Halfspace Testing

    In this thesis I study the problem of testing halfspaces under arbitrary probability distributions, using only random samples. A halfspace, or linear threshold function, is a boolean function f : Rⁿ → {±1} defined as the sign of a linear function; that is, f(x) = sign(Σᵢ wᵢxᵢ - θ) where we refer to w ∈ Rⁿ as a weight vector and θ ∈ R as a threshold. These functions have been studied intensively since the middle of the 20th century; they appear in many places, including social choice theory (the theory of voting rules), circuit complexity theory, machine learning theory, hardness of approximation, and the analysis of boolean functions. The problem of testing halfspaces, in the sense of property testing, is to design an algorithm that, with high probability, decides whether an unknown function f is a halfspace or far from a halfspace, using as few examples of labelled points (x, f(x)) as possible. In this work I focus on the problem of testing halfspaces using only random examples drawn from an arbitrary distribution, where the algorithm cannot choose the points it receives. This is in contrast with previous work on the problem, where the algorithm can query points of its choice, and the distribution was assumed to be uniform over the boolean hypercube. Towards a solution to this problem I present an algorithm that works for rotationally invariant probability distributions (under reasonable conditions), using roughly O(√n) random examples, which is close to the known lower bound of Ω(√n / √(log n)). I further develop the algorithm to work for mixtures of two such rotationally invariant distributions and provide a partial analysis. I also survey related machine learning results, and conclude with a survey of the theory of halfspaces over the boolean hypercube, which has recently received much attention.
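
    The definition of a halfspace above, and the sample-only access model, can be made concrete with a few lines of Python. This is only an illustrative sketch: the function names `halfspace` and `gaussian_examples`, the choice of a standard Gaussian as the rotationally invariant distribution, and the convention sign(0) = +1 are assumptions, not details from the thesis.

```python
import random

def halfspace(w, theta):
    """Return f(x) = sign(sum_i w_i x_i - theta), with sign(0) taken as +1."""
    def f(x):
        return 1 if sum(wi * xi for wi, xi in zip(w, x)) - theta >= 0 else -1
    return f

def gaussian_examples(f, n, m, seed=0):
    """Draw m labelled examples (x, f(x)) with x a standard Gaussian in R^n.

    The standard Gaussian is one example of a rotationally invariant distribution.
    """
    rng = random.Random(seed)
    examples = []
    for _ in range(m):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        examples.append((x, f(x)))
    return examples

# Usage: a halfspace in R^2 and five random labelled examples.
f = halfspace([1.0, -2.0], 0.5)
for x, label in gaussian_examples(f, n=2, m=5):
    print(x, label)
```

    A tester in this model sees only such a list of pairs; it cannot choose the points x on which f is evaluated.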

    Graphs with minimum fractional domatic number

    The domatic number of a graph is the maximum number of vertex-disjoint dominating sets that partition the vertex set of the graph. In this paper we consider the fractional variant of this notion. Graphs with fractional domatic number 1 are exactly the graphs that contain an isolated vertex. Furthermore, it is known that all other graphs have fractional domatic number at least 2. In this note we characterize graphs with fractional domatic number 2. More specifically, we show that a graph without isolated vertices has fractional domatic number 2 if and only if it has a vertex of degree 1 or a connected component isomorphic to a 4-cycle. We conjecture that if the fractional domatic number is more than 2, then it is at least 7/3.

    Randomized Communication and Implicit Graph Representations

    We study constant-cost randomized communication problems and relate them to implicit graph representations in structural graph theory. Specifically, constant-cost communication problems correspond to hereditary graph families that admit constant-size adjacency sketches, or equivalently constant-size probabilistic universal graphs (PUGs), and these graph families are a subset of families that admit adjacency labeling schemes of size O(log n), which are the subject of the well-studied implicit graph question (IGQ). We initiate the study of the hereditary graph families that admit constant-size PUGs, with the two (equivalent) goals of (1) understanding randomized constant-cost communication problems, and (2) understanding a probabilistic version of the IGQ. For each family $\mathcal F$ studied in this paper (including the monogenic bipartite families, product graphs, interval and permutation graphs, families of bounded twin-width, and others), it holds that the subfamilies $\mathcal H \subseteq \mathcal F$ are either stable (in a sense relating to model theory), in which case they admit constant-size PUGs, or they are not stable, in which case they do not. The correspondence between communication problems and hereditary graph families allows for a new method of constructing adjacency labeling schemes. By this method, we show that the induced subgraphs of any Cartesian products are positive examples to the IGQ. We prove that this probabilistic construction cannot be derandomized by using an Equality oracle, i.e. the Equality oracle cannot simulate the k-Hamming Distance communication protocol. We also obtain constant-size sketches for deciding $\mathsf{dist}(x, y) \le k$ for vertices $x$, $y$ in any stable graph family with bounded twin-width. This generalizes to constant-size sketches for deciding first-order formulas over the same graphs.

    Multi-messenger observations of a binary neutron star merger

    On 2017 August 17 a binary neutron star coalescence candidate (later designated GW170817) with merger time 12:41:04 UTC was observed through gravitational waves by the Advanced LIGO and Advanced Virgo detectors. The Fermi Gamma-ray Burst Monitor independently detected a gamma-ray burst (GRB 170817A) with a time delay of ~1.7 s with respect to the merger time. From the gravitational-wave signal, the source was initially localized to a sky region of 31 deg$^2$ at a luminosity distance of $40^{+8}_{-8}$ Mpc and with component masses consistent with neutron stars. The component masses were later measured to be in the range 0.86 to 2.26 $M_\odot$. An extensive observing campaign was launched across the electromagnetic spectrum leading to the discovery of a bright optical transient (SSS17a, now with the IAU identification of AT 2017gfo) in NGC 4993 (at ~40 Mpc) less than 11 hours after the merger by the One-Meter, Two Hemisphere (1M2H) team using the 1 m Swope Telescope. The optical transient was independently detected by multiple teams within an hour. Subsequent observations targeted the object and its environment. Early ultraviolet observations revealed a blue transient that faded within 48 hours. Optical and infrared observations showed a redward evolution over ~10 days. Following early non-detections, X-ray and radio emission were discovered at the transient’s position ~9 and ~16 days, respectively, after the merger. Both the X-ray and radio emission likely arise from a physical process that is distinct from the one that generates the UV/optical/near-infrared emission. No ultra-high-energy gamma-rays and no neutrino candidates consistent with the source were found in follow-up searches. These observations support the hypothesis that GW170817 was produced by the merger of two neutron stars in NGC 4993 followed by a short gamma-ray burst (GRB 170817A) and a kilonova/macronova powered by the radioactive decay of r-process nuclei synthesized in the ejecta.

    Genome-wide structural variant analysis identifies risk loci for non-Alzheimer’s dementias

    We characterized the role of structural variants, a largely unexplored type of genetic variation, in two non-Alzheimer’s dementias, namely Lewy body dementia (LBD) and frontotemporal dementia (FTD)/amyotrophic lateral sclerosis (ALS). To do this, we applied an advanced structural variant calling pipeline (GATK-SV) to short-read whole-genome sequence data from 5,213 European-ancestry cases and 4,132 controls. We discovered, replicated, and validated a deletion in TPCN1 as a novel risk locus for LBD and detected the known structural variants at the C9orf72 and MAPT loci as associated with FTD/ALS. We also identified rare pathogenic structural variants in both LBD and FTD/ALS. Finally, we assembled a catalog of structural variants that can be mined for new insights into the pathogenesis of these understudied forms of dementia.